Psychometric properties of the International Society of Wheelchair Professionals’ basic manual wheelchair-service-provision knowledge Test Version 1 and development of Version 2

Introduction Valid and reliable scores from measurement tools to test competency in basic manual wheelchair-service-provision are needed to promote good practice and support capacity building. The International Society of Wheelchair Professionals’ (ISWP) Basic Test Version 1 in English, launched in 2015, is the most frequently used outcome measure tool to test basic manual wheelchair-service-provision knowledge and is part of an international certification process. Despite the wide acceptance and use of the test, its psychometric properties have not yet been established. The objectives of this study were 1) to evaluate the test’s psychometric properties, 2) to develop the test’s Version 2, and 3) to evaluate the content validity of the new version. Methods For Objective 1, methods from the Classical Test Theory were used to obtain items’ difficulty, item discrimination index and domains’ reliability. For Objective 2, a team of experts in wheelchair service delivery and education conducted a systematic qualitative review of the questions’ text and answers and updated them using evidence-based guidelines. For Objective 3, an external team reviewed the clarity, relevance and domain allocation of the developed items using a 4-point Likert scale. Descriptive statistics were used to describe and characterize the results for each objective. Item-content (I-CVI) and Scale-content (S-CVI) validity indexes were calculated to compute content validity. Results For Objective 1, all domains in the test were below the threshold for acceptable internal consistency reliability; 80% of the total test pool (116 items from the total pool of 145) did not meet the thresholds for item difficulty and index of discrimination suggested in the literature. Of the items in the Test, 78% could be responded to intuitively and 66% did not distinguish between test-takers who were knowledgeable in the content area and those who were not. 
For Objective 2, experts found concerns such as items being grouped in the wrong domain, being repeated, not using person-first language, and using terms inconsistently. Thirty-four (23.4%) items were dropped and 111 (76.5%) were updated. In addition, 61 new items were developed. Members re-categorized the items and proposed a new classification of subdomains. For Objective 3, good agreement between subject-matter experts was found; the S-CVI calculated using the I-CVIs related to item clarity was 84% while using the I-CVIs related to item relevance was 98%. Only 7 items (4.1%) were deemed to be in the wrong domain and 4 items (2.3%) were considered irrelevant and dropped. Conclusion The psychometric evidence in support of ISWP Basic Test Version 1 in English is suboptimal. A new set of items developed by experts in the field has shown excellent content validity. Ongoing assessments will be needed as ISWP Basic Test Version 2 is implemented and monitored.



Introduction
Capacity building in wheelchair-service-provision is a key component in promoting good practice and developing sustainable wheelchair provision systems worldwide [1]. To support competency development among wheelchair service personnel, valid evidence-based and reliable measurement tools are needed. A recent scoping review on wheelchair-service-provision education highlighted the availability and use of open-source training packages that have been supporting the development of competencies in wheelchair-service-provision [2]. In particular, the World Health Organization Wheelchair Service Training Package-Basic Level (WHO Basic Package) [3] (a training resource that supports wheelchair service delivery for people not requiring postural support) and the Wheelchair Skills Program (WSP) [4] (protocol related to the assessment and training of wheelchair skills) were the two most frequently reported training programs used to improve competency in wheelchair-service-provision among rehabilitation students, clinicians and lay health workers [2]. The evaluation of basic manual wheelchair-service-provision knowledge was the most frequently evaluated competency, and the International Society of Wheelchair Professionals (ISWP) Basic Manual Wheelchair-Service-Provision Knowledge Test Version 1 (ISWP Basic Test Version 1) was the most frequently used outcome measure [2,5]. Basic manual wheelchair-service-provision knowledge is needed to provide wheelchair services to people with mobility impairment who can sit upright without additional postural support [3]. Studies that have used the ISWP Basic Test Version 1 targeted rehabilitation students and clinicians in Colombia [6][7][8], India [9], Mexico [9], and the United States [10]. The ISWP Basic Test Version 1 was used to measure changes in manual wheelchair-service-provision knowledge after a training intervention [6,9,10] or manual wheelchair-service-provision knowledge among rehabilitation students [7,8]. 
Furthermore, a recent exploratory analysis revealed associations between the ISWP Basic Test Version 1 and test takers' education, motivation, and country income setting [11].
The ISWP Basic Test Version 1, launched in 2015, is an online multiple-choice test that was developed based exclusively on the WHO Basic Package [3]. The test includes 75 items taken from a total pool of 145 items that are organized into 7 domains of wheelchair-service-delivery knowledge: Assessment, Prescription, Fitting, Production, User training, Process, and Maintenance and repairs [5]. The domains are aligned with the content of the WHO 8-steps of wheelchair service delivery [3,12] but they were grouped differently. In the ISWP Basic Test Version 1, the domain Process consists of Referral and appointment and Funding and ordering [5]. The domains have different weights based on the pre-set number of questions within each domain [5]. To reduce the likelihood of receiving the same question when taking the test multiple times, each domain has a pool of questions from which only a subset is drawn. The ISWP Basic Test Version 1 settings include: 1) random distribution of questions from each domain's pool of questions; 2) forced completion, requiring participants to complete the test in a one-time entry; and 3) immediate scoring of the test with the opportunity to review both correct and incorrect answers [5]. A total test score of at least 70% is considered a passing grade.
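The test-assembly and scoring rules described above can be sketched in a few lines of Python. This is an illustrative sketch only: the domain pools, the number of items drawn per domain, and the function names `assemble_test` and `passed` are hypothetical, since only the overall figures (75 items drawn from a pool of 145, and a 70% passing score) are reported here.

```python
import random
from typing import Dict, List


def assemble_test(pools: Dict[str, List[str]], draws: Dict[str, int],
                  rng: random.Random) -> List[str]:
    """Draw a random subset of items from each domain's pool, without repeats."""
    test: List[str] = []
    for domain, n_items in draws.items():
        test.extend(rng.sample(pools[domain], n_items))
    return test


def passed(n_correct: int, n_items: int, threshold: float = 0.70) -> bool:
    """A total score of at least 70% is treated as a passing grade."""
    return n_correct / n_items >= threshold


# Hypothetical pools and per-domain draw counts (NOT the real ISWP values).
pools = {
    "Assessment": [f"A{i}" for i in range(1, 11)],
    "Fitting": [f"F{i}" for i in range(1, 7)],
}
draws = {"Assessment": 5, "Fitting": 3}

test_items = assemble_test(pools, draws, random.Random(2015))
```

Under these rules, a test-taker answering a 75-item test would need at least 53 correct answers to reach the 70% threshold.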
As of November 2022, the ISWP Basic Test Version 1 has been taken by 6,619 test takers from 106 countries and is part of an international wheelchair-service-provision fee-based certification process (https://wheelchairnetwork.org/resource-library/iswp-tests/) that has been pursued by 367 people. The ISWP Basic Test Version 1 has been translated into 14 languages. The languages are requested by the community and undergo a formal forward-only translation process that includes a review by two bilingual subject matter experts who are specifically recruited for this task [13]. Recently, Rushton et al. [14] conducted a translation and adaptation of the ISWP Basic Test Version 1 from English to French-Canadian along with a preliminary evaluation of its internal consistency. Their results indicated a low degree of internal consistency between the items in the ISWP Basic Test and the translated French-Canadian version. The authors recommended evaluating the internal consistency of the ISWP Basic Test.
Despite the rapid uptake and use of the ISWP Basic Test Version 1, its psychometric properties have not been reported and the test has been neither revised nor updated since its launch. The ISWP Basic Test lacks evidence of its content validity and reliability (e.g., internal consistency) to support the test's relevance for its intended purpose. The methods followed to develop the ISWP Basic Test have multiple limitations, such as the exclusion of relevant evidence-based materials (e.g., the WSP) and the lack of methodological details related to the development of items, selection of domains and assessment of the clarity and relevance of questions [5]. To enhance the quality of research and to ensure that a valid and reliable test is being used as part of an international wheelchair service provision certification process, the ISWP Basic Test Version 1 needs to be revised, evaluated, and updated considering its items' performance and the inclusion of other relevant training packages.
The three objectives of this project were: 1) to evaluate the psychometric properties (i.e., item difficulty, item discrimination index, and domains' internal consistency reliability) of the ISWP Basic Test Version 1; 2) to develop the ISWP Basic Test Version 2; and 3) to evaluate the content validity (i.e., item- and scale-content validity indices [I-CVI and S-CVI, respectively]) of the items included in ISWP Basic Test Version 2. This study aims to improve a measurement tool to test wheelchair-service-provision competency and contribute to supporting good practice and capacity building.

Methods

Objective one: Evaluate the psychometric properties of the ISWP Basic Test
We analyzed the data from the ISWP Basic Test Version 1 in English hosted on Test.com from January 1, 2015, to January 24, 2020. The inclusion criteria for the datasets considered were successful test attempts (i.e., test-takers who completed both the demographic and the multiple-choice sections), and first attempts by test-takers. All test-takers gave informed consent to share their test scores and sociodemographic information for the purpose of this study. The study was approved by the University of Pittsburgh's Institutional Review Board (STUDY19100169).

Statistical approach
A) Investigate passing rates and possible relations between test-takers' characteristics and total test scores. The total test scores were analyzed for normality of distribution using the Shapiro-Wilk test (W) and visually inspected for normal-distribution assumption using a histogram. Descriptive statistics, measures of central tendency, and measures of spread were calculated to summarize and describe the data and to determine the percentage of test-takers who passed the test. Significance was set at α = 0.05. All analyses were performed using STATA software, version 15.
B) Investigate the performance of questions (items) and domains. Methods from Classical Test Theory were used to evaluate the data. The following item-level statistics were obtained:
• Item difficulty (p). This was the proportion of test-takers who answered an item correctly [15,16].

$$p = \frac{\text{Number of people who correctly answer item } j}{N}$$
This proportion was obtained by dividing the number of test-takers who responded correctly to an item (j) by the total number of respondents (N). The p-values range from 0.00 to 1.00. Lower p-values are indicative of more difficult items while higher p-values suggest easier items [15]. Multiple-choice questions are affected by random guessing, the number of answer options, and partial knowledge that test-takers may have that allows them to eliminate answer options before guessing [16]. To adjust for random guessing, acceptable p-values for items with 4 answer options are ≤ 0.74 [16,17].
• Item discrimination index (IDI). This index allows differentiating between test-takers who know the criterion of interest and those who do not [15,16]. IDIs were obtained by organizing the group's total test scores in descending order and then grouping the upper and lower 27% [18]. After the groups had been identified, the IDI was obtained:
$$\mathrm{IDI} = p_u - p_l$$
where $p_u$ is the proportion in the upper group who answered the item correctly and $p_l$ is the proportion in the lower group who answered correctly. IDIs range from -1.00 to 1.00. Positive values indicate that the item discriminates in favor of the upper group, and negative values are in favor of the lower-scoring group, indicating that the item is a reverse discriminator [16]. As suggested by the literature, IDIs <0.30 were flagged for revision as these items are marginal and should be eliminated or completely revised [19].
• Domains' reliability. We used the Kuder-Richardson Formula 20 (KR-20) to estimate internal consistency. KR-20 is commonly used to estimate the internal consistency of an achievement test when the test measures a unidimensional trait [20][21][22].
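The three statistics listed above can be illustrated with a minimal Python sketch. It is not the analysis code used in the study (which was run in STATA): the function names and example scores are hypothetical, the sketch assumes that every test-taker answered every item (whereas the actual test draws a subset of items per attempt), the upper and lower groups are sized by rounding 27% of the sample, and KR-20 is computed with the population variance of total scores; other conventions exist.

```python
from statistics import pvariance
from typing import List, Sequence

Matrix = Sequence[Sequence[int]]  # rows = test-takers, columns = items, cells = 0/1


def item_difficulty(scores: Matrix, item: int) -> float:
    """p: proportion of test-takers who answered `item` correctly."""
    return sum(row[item] for row in scores) / len(scores)


def discrimination_index(scores: Matrix, item: int, fraction: float = 0.27) -> float:
    """IDI = p_u - p_l, using the top and bottom `fraction` of total scores."""
    ranked = sorted(scores, key=sum, reverse=True)
    k = max(1, round(len(ranked) * fraction))  # group size; rounding is an assumption
    p_upper = sum(row[item] for row in ranked[:k]) / k
    p_lower = sum(row[item] for row in ranked[-k:]) / k
    return p_upper - p_lower


def kr20(scores: Matrix) -> float:
    """KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores)."""
    k = len(scores[0])
    sum_pq = 0.0
    for j in range(k):
        p = item_difficulty(scores, j)
        sum_pq += p * (1 - p)
    total_variance = pvariance([sum(row) for row in scores])  # population variance (assumed)
    return k / (k - 1) * (1 - sum_pq / total_variance)


# Hypothetical 0/1 scores for 10 test-takers on 4 items (not study data).
scores: List[List[int]] = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

p_values = [item_difficulty(scores, j) for j in range(4)]
idis = [discrimination_index(scores, j) for j in range(4)]
reliability = kr20(scores)
```

In this toy example the first item is answered correctly by 80% of test-takers, and the last item has an IDI of zero because the upper and lower groups answer it correctly at the same rate.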

Objective two: Develop the ISWP Basic Test Version 2
Committee formation. A purposive sampling method was used to recruit an international group of stakeholders to form the Assessment Tools Committee (ATC). ISWP sent 10 invitation letters describing the scope of the project, time commitment, and work effort. The inclusion criteria were having at least 5 years of experience in wheelchair service delivery, being familiar with the WHO Basic Package and the WSP, being actively engaged in the wheelchair sector, having access to a computer with Internet connectivity, and being fluent in English. Potential members who agreed to be part of the committee responded to a demographic survey that recorded socio-demographic characteristics; perceived familiarity with the content of the WHO and WSP materials; and perceived confidence in interpreting components of psychometric analysis, correlations, and research methodology. The purpose of the ATC was to conduct a qualitative review of the items in the ISWP Basic Test guided by the quantitative analysis from Objective 1.
Revision process. Before the start of the revision process, members of the ATC received training on the use of the assessment forms, the methodology for the revision of the items, and the evidence-based guidelines for the development and improvement of items. The training lasted two hours, was online, synchronous, and facilitated by the project lead (YBM) using Zoom (https://zoom.us/). The project lead was a physiotherapist and rehabilitation researcher with seven years of experience designing and implementing educational interventions related to improving wheelchair service provision competencies among rehabilitation professionals. After the ATC training, an online repository with relevant materials (i.e., articles, books, videos, and presentations) was made available for all ATC members. The WHO Basic Package [3] and the WSP [4] were the training packages used to review, update, and create new items. ATC members were encouraged to create new items when not all aspects of basic manual wheelchair-service-provision knowledge were covered by the current items. Seven teams, each consisting of 2-4 members of the ATC, were formed; each team was assigned to review one domain of the ISWP Basic Test (i.e., Assessment, Prescription, Fitting, Production, Users' training, Process, and Maintenance and repairs). Members of the ATC were purposefully assigned to the teams considering their area of expertise. In addition, a senior member (RLK) and the project lead (YBM) participated in all groups.
The revision process consisted of four phases:
• Phase 1: Analysis and revision. This phase included two sequential steps.
1. Qualitative revision of the questions' text and answers. ATC members evaluated the quality of the questions using the Question Guidelines Table (S1 Table), a matrix created by the research team that included evidence-based guidelines for developing questions' text and answers and prompted ATC members to determine whether the questions met the criteria.

2. Proposed changes to the question texts and answers. ATC members used the Domain Feedback Form (S2 Table) to analyze the results from the quantitative analysis (from Step 1) and propose changes. The Domain Feedback Form consisted of three sections:
a. Section 1. Domain summary table. This table was informative and included the total number of items in the domain, the total number of items flagged for revision, and the domain's internal consistency reliability.
b. Sections 2 & 3. Flagged and unflagged questions for revision. These sections grouped the questions that did and did not meet the thresholds established by the methods from Classical Test Theory (Objective 1). The tables included question texts and answers, p-values, IDIs, and frequency and percentage of answer-option selections. An open section next to each question prompted ATC members to propose changes. ATC members were encouraged to use the previously completed Question Guidelines Table to address identified issues and to propose changes that followed evidence-based guidelines. ATC members were asked to submit the Domain Feedback Form prior to Phase 2.
• Phase 2: Online team synchronous meetings. During the online team meetings, ATC members discussed the results from Phase 1 and the proposed changes until consensus was reached. When new items were created, members rated them using the Question Guidelines Table to ensure the new questions met evidence-based guidelines. Consensus was considered to have been reached when the majority of ATC members approved changes to existing items and the inclusion of new items. Once consensus was reached, a domain's pre-final draft with all items was created. The meetings were facilitated by the project lead on the Zoom platform; they were recorded and made available for members who were unable to attend.
• Phase 3: Pre-final version. The domains' pre-final drafts were distributed to all ATC members for their review. If ATC members disagreed with any item, they emailed their concerns to the project lead and met with the corresponding team that created the item. When consensus was reached, the pool of questions was finalized and considered ready for the last phase.
• Phase 4: Classification and final revision. The project lead and a senior member categorized all questions independently and proposed the domains' definitions using the WHO Basic Package and the WSP content as references. The new domains with their definitions and questions were distributed to all ATC members for their review and approval. When a consensus was reached, the final set of questions was ready for content validity analysis.

Objective three: Evaluate the content validity of the items included in ISWP Basic Test Version 2
Participant recruitment. A purposive sampling method was used to recruit at least 8 participants comprising researchers, educators and clinicians to form the Subject Matter Experts Group (SMEG). This sample size aligned with standard recommendations [23,24]. To be eligible to participate, experts had to have at least 5 years of experience in wheelchair service delivery, be familiar with the WHO Basic Package and the WSP, be actively engaged in the wheelchair sector, have access to a computer with Internet connectivity, be fluent in English, and not be members of the ATC. Completing the Qualtrics survey served as participants' consent to participate in the study.
Survey development and administration. To evaluate content validity, a survey was created consisting of a socio-demographic questionnaire and the item pool of questions developed by the ATC. The survey asked participants to rate the clarity and relevance of each item on a 4-point Likert scale (1 = strongly disagree to 4 = strongly agree), provide open-ended feedback regarding content and wording immediately after rating each item, and indicate if the item was representative of the allocated domain. The use of a 4-point Likert scale has been suggested in the literature to avoid a neutral midpoint [23]. Ratings were dichotomized as "agree" (for those who responded 3 or 4) and "disagree" (for those who responded 2 or 1), following the typical recommendation for computing content validity [24,25]. The survey was hosted in Qualtrics, a web-based survey tool, and distributed online via an external link.
Members of the SMEG attended a one-hour synchronous meeting in which the scope of the project, instructions, and structure of the survey were explained. The survey remained open for four weeks and allowed participants to save their progress and complete the survey in multiple attempts. Participants who completed the survey received a letter from ISWP acknowledging their time and contributions, and they are listed in the Acknowledgments section of this paper.
Content validity assessment. Descriptive statistics were used to describe and characterize the sample. The I-CVI and the S-CVI were calculated from the item pool. As recommended in the literature, the I-CVI was calculated by taking the number of participants in agreement based on the dichotomized rating scale and dividing it by the total number of participants in the SMEG who answered the question [23,24]. The S-CVI was calculated by summing the I-CVIs and dividing the sum by the total number of items; this method is referred to in the literature as S-CVI/Average [23,24]. Acceptable agreement was set at ≥ 0.78 for the I-CVI and at ≥ 0.90 for the S-CVI [23,24,26,27]. The ATC used the I-CVIs and the open-ended responses from the survey to guide them in revising, updating, or deleting the items.
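The content validity calculation described above can be sketched as follows. This is a minimal illustration rather than the study's own code: the function names and the example ratings are hypothetical, and ratings of 3 or 4 on the 4-point scale are treated as agreement, following the dichotomization described in the survey section.

```python
from typing import List, Sequence


def item_cvi(ratings: Sequence[int]) -> float:
    """I-CVI: share of experts rating the item 3 or 4 on the 4-point scale."""
    agree = sum(1 for rating in ratings if rating >= 3)
    return agree / len(ratings)


def scale_cvi_average(all_ratings: Sequence[Sequence[int]]) -> float:
    """S-CVI/Average: mean of the I-CVIs across all items."""
    i_cvis = [item_cvi(ratings) for ratings in all_ratings]
    return sum(i_cvis) / len(i_cvis)


# Hypothetical clarity ratings from 8 experts for 3 items (not study data).
clarity_ratings: List[List[int]] = [
    [4, 4, 3, 4, 3, 4, 4, 4],  # all 8 experts agree
    [4, 3, 3, 4, 2, 4, 3, 4],  # 7 of 8 agree
    [2, 3, 1, 4, 2, 3, 2, 2],  # 3 of 8 agree
]

i_cvis = [item_cvi(ratings) for ratings in clarity_ratings]
s_cvi = scale_cvi_average(clarity_ratings)
# Items whose I-CVI falls below 0.78 would be flagged for review.
below_threshold = [index for index, value in enumerate(i_cvis) if value < 0.78]
```

With eight raters, an item needs agreement from at least seven experts (7/8 = 0.875) to reach the 0.78 threshold, since six of eight yields only 0.75.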

Results

Objective one: Evaluate the psychometric properties of the ISWP Basic Test Version 1
Investigate passing rates and possible relations between test-takers' characteristics and total test scores. Table 1 presents the characteristics of all test-takers and of the test-takers who passed and did not pass the test. A total of 1276 test attempts of the ISWP Basic Test Version 1 in English were completed between January 1, 2015, and January 24, 2020; 1108 (86.8%) were successful (i.e., completed both the demographic and multiple-choice sections) and represented 947 unique users on their first test attempts. Four users were removed from the analysis due to incomplete attempts that were not captured in the first exclusion. The sample retained for analysis consisted of 943 test-takers, with a near gender-balanced representation and a mean age of 35 years. Test-takers were geographically distributed in five continents (i.e., Asia, Africa, America, and Europe) with the majority located in Asia. The most frequent educational level was a bachelor's degree and Physical Therapist was the most frequently reported certification. More than half of the test-takers had less than 3 years of experience in wheelchair-service-provision by the time they took the test and most test-takers reported spending 3-20 hours per week in wheelchair service delivery. About three-quarters of the test-takers passed the ISWP Basic Test on their first attempt; in this group, the highest domain mean score was from Process and the lowest was from Fitting. About one-quarter of test-takers failed the ISWP Basic Test; in this group, the domain with the highest score was Assessment and the lowest was Fitting.

Investigate domains' internal consistency reliability and questions' performance.
None of the domains in the ISWP Basic Test had a KR-20 coefficient ≥ 0.80 (Table 2), indicating a weak relationship between items in each domain. Of the 145 multiple-choice questions that comprised the ISWP Basic Test's total pool, 116 questions (80%) were flagged for revision as they did not meet the minimum thresholds for difficulty (i.e., p-values ≤ 0.74) and index of discrimination (i.e., values >0.30). Of the flagged questions, 90 (77.6%) had a p-value >0.74, which indicates that the questions are extremely easy; 76 questions had an IDI <0.30, which implies that those questions do not discriminate between test-takers who know the content and those who do not. Fifty-seven questions (49.1%) met neither criterion. The domains with the highest percentage of questions flagged for revision were Process (92%), Assessment (91.2%), and Prescription (88.2%). In contrast, the domain with the lowest percentage of questions flagged for revision was Production (50%). Only 29 questions (20%) from the total pool of ISWP Basic Test questions met the criteria (i.e., p-value and IDI) and were not flagged for revision. Table 2 presents the domains' internal consistency reliability and the questions' performance and their item-level statistics.
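The flagging rule applied to each question can be expressed compactly. The sketch below assumes that the item statistics have already been computed; the `ItemStats` container, the example values, and the treatment of an IDI of exactly 0.30 as acceptable are assumptions made for illustration, not details taken from the study.

```python
from dataclasses import dataclass
from typing import List

P_MAX = 0.74    # items with a higher p-value are considered too easy
IDI_MIN = 0.30  # items with a lower IDI are considered poor discriminators


@dataclass
class ItemStats:
    item_id: str
    p: float    # item difficulty (proportion correct)
    idi: float  # item discrimination index


def flag_reasons(item: ItemStats) -> List[str]:
    """Return the reasons (if any) why an item is flagged for revision."""
    reasons = []
    if item.p > P_MAX:
        reasons.append("too easy")
    if item.idi < IDI_MIN:
        reasons.append("low discrimination")
    return reasons


# Hypothetical item statistics (not study data).
items = [
    ItemStats("Q1", p=0.92, idi=0.10),
    ItemStats("Q2", p=0.60, idi=0.45),
    ItemStats("Q3", p=0.81, idi=0.35),
    ItemStats("Q4", p=0.50, idi=0.12),
]

flagged = [item.item_id for item in items if flag_reasons(item)]
share_flagged = len(flagged) / len(items)
```

An item is flagged if it fails either threshold, so the count of flagged items can be smaller than the sum of the "too easy" and "low discrimination" counts when some items fail both.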

Objective two: Develop the ISWP Basic Test Version 2
Committee formation. All ten recruited stakeholders accepted our invitation to join the ATC. Table 3 shows their characteristics. The group consisted mainly of females, with a mean age of about 45 years and an average of 16 years of involvement in the wheelchair sector. Most members had completed a graduate degree. Three members (from the Philippines, the USA and Colombia) dropped out due to the workload.

Revision process Phase 1: Analysis and revision.
Qualitative revision of the question texts and answers. Table 4 presents the average percentage of questions by sub-domain that met each criterion. Overall, the stem guidelines with the lowest average percentage of agreement across domains were 'stem is meaningful by itself and presents a definite problem' and 'stem does not contain irrelevant material', while the answer-option guidelines with the lowest agreement were 'alternatives are free from clues' and 'alternatives are mutually exclusive'.
Proposed changes to the questions' text and answers. In addition to the issues listed in Table 4, ATC members identified several concerns in the items such as 1) the misplacement of questions in the domains; 2) the repetition of questions; 3) the absence of person-first language; and 4) inconsistency in the use of terms. Considering these problems and the performance of questions and domains, the ATC decided to review all 145 items (both flagged and unflagged questions) that comprised the ISWP Basic Test item pool.

Phases 2 & 3: Team synchronous meetings and pre-final version.
A total of 10 two-hour meetings were held between ATC teams (n = 7) to review all questions (stem and answer options) and their domain allocation, and to prepare the final set of questions for content-validity analysis. Members used the results from the qualitative review to guide the revision process. This procedure resulted in dropping 34 items (23.4%), updating 111 items (76.5%) and creating 61 new items. Table 5 presents the review details grouped by domain. The item revision involved updating the content and terminology considering evidence-based materials, complying with the question guidelines, and copy editing (grammar, flow, spelling, and punctuation). In addition, ATC members deleted duplicated questions and re-categorized questions when they were in the wrong domain.

Phase 4. Classification and final revision.
A total of 172 multiple-choice questions were reclassified considering the WHO 8-steps and retained for content validity assessment. Table 6 includes the newly proposed domain classification of the items, the sub-domain definitions, and the number of questions per domain. The ATC decided to use the same sub-domains as the WHO Basic Package, with two caveats. First, due to the limited information provided in the manual regarding step 4 (funding and ordering) and the variability of funding and ordering processes in international settings, this step was not included as a sub-domain to test basic manual wheelchair-service-provision knowledge. Second, step 3 (prescription) and step 5 (product [wheelchair] preparation) were combined into one sub-domain because the two steps were considered synergistic.
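As a quick check, the item counts reported for the revision process reconcile with the 172 questions retained for content validity assessment. The lines below only restate figures already given in the text:

```latex
\begin{align*}
145 - 34 &= 111 && \text{(Version 1 items retained and updated)} \\
111 + 61 &= 172 && \text{(updated items plus newly created items)}
\end{align*}
```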

Objective three: Evaluate the content validity of the items included in ISWP Basic Test Version 2
Eight subject matter experts completed the survey. Their characteristics are documented in Table 7. The group was gender-balanced, with a mean age of about 40 years and an average of 18 years of involvement in the wheelchair sector. Most members' primary occupation was clinically oriented, with a focus on wheelchair-service-provision. The I-CVIs for the clarity of the 172 multiple-choice questions ranged from 0.38 to 1.00, with 66 items (38.4%) falling below the acceptable threshold of 0.78. In terms of item relevance, I-CVIs ranged from 0.57 to 1.00, with 4 items (2.3%) falling below the 0.78 threshold; these items did not meet the clarity threshold either. Seven questions had less than 80% agreement about their domain allocation. Good agreement between subject matter experts was found: the S-CVI calculated from the item-clarity I-CVIs was 84%, and the S-CVI calculated from the item-relevance I-CVIs was 98%. Table 8 groups the content validity results by sub-domain. ATC members reviewed and updated the items flagged for clarity review and domain allocation using the feedback provided in the open-ended comments. The four items that did not meet the clarity and relevance thresholds were dropped, leaving a total pool of 168 items.

Table 6. Questions and domain classification for content validity.

Sub-domain | Definition | N
Core Knowledge | General information about best practices to undertake the wheelchair-service-provision process. | 8
Assessment | Refers to step 2 of the WHO 8 steps, which includes the assessment interview and the physical assessment of the wheelchair user. The information collected in the interview includes the wheelchair user's contact information, physical condition, lifestyle and environment, and information about the existing wheelchair (if applicable). In the physical assessment, the provider identifies the presence, risk, or history of pressure injuries; identifies the user's method of pushing; obtains the user's body measurements that will help select the size of the wheelchair; and assesses current wheelchair skills. | 43
Prescription and Product Preparation | Refers to steps 3 (prescription) and 5 (product preparation) of the WHO 8 steps. The prescription includes the selection of the best available wheelchair and cushion for the user based on his/her needs; the size of the wheelchair and cushion (considering the body measurements obtained in step 2); and the features (characteristics) of the wheelchair based on the user's needs. The product preparation includes wheelchair adjustments and revisions made by the provider to ensure that the wheelchair is safe and ready to be used. | 38
Fitting | Refers to step 6 of the WHO 8 steps. The fitting includes checking that the wheelchair size and adjustments, the pressure, the posture, and the fit of the wheelchair user while moving are adequate, and that the cushion is working properly. |
User Training | Refers to step 7 of the WHO 8 steps. The user training includes teaching the wheelchair user (and, if appropriate, the caregiver) wheelchair skills, transfers, prevention of pressure injuries, and cushion and wheelchair maintenance tasks. |
Follow-up | Refers to step 8 of the WHO 8 steps. In the follow-up, the wheelchair provider checks the condition of the wheelchair, its fit, the need for additional training, and that the wheelchair still meets the user's needs. |
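To make the content validity indexes above concrete, the following is a minimal sketch of how an I-CVI and an averaged S-CVI are commonly computed from 4-point expert ratings. It assumes the widely used convention that a rating of 3 or 4 counts as agreement and that the S-CVI is the mean of the I-CVIs; the function names and the ratings are hypothetical and are not the study's data or code.

```python
# Sketch of content validity indexes from 4-point expert ratings.
# Assumptions (not taken from the study): ratings of 3 or 4 count as
# agreement, skipped ratings are stored as None, and the S-CVI is the
# average of the item-level indexes.

def item_cvi(ratings):
    """Proportion of responding experts who rated the item 3 or 4."""
    answered = [r for r in ratings if r is not None]
    if not answered:
        raise ValueError("no expert rated this item")
    return sum(1 for r in answered if r >= 3) / len(answered)

def scale_cvi_average(ratings_by_item):
    """Mean of the item-level indexes across all items."""
    icvis = [item_cvi(r) for r in ratings_by_item]
    return sum(icvis) / len(icvis)

def flag_items(ratings_by_item, threshold=0.78):
    """Indexes of items whose I-CVI falls below the threshold."""
    return [i for i, r in enumerate(ratings_by_item) if item_cvi(r) < threshold]

# Hypothetical clarity ratings from eight experts for three items.
ratings_by_item = [
    [4, 4, 3, 4, 3, 4, 4, 3],  # all eight agree
    [4, 3, 2, 4, 3, 4, 2, 3],  # six of eight agree
    [2, 1, 3, 2, 4, 2, 3, 1],  # three of eight agree
]

icvis = [item_cvi(r) for r in ratings_by_item]
flagged = flag_items(ratings_by_item)
s_cvi = scale_cvi_average(ratings_by_item)
```

With eight raters, three agreeing experts give an I-CVI of 0.375, which is one way a minimum near 0.38 can arise. An item rated by only seven experts would be scored out of seven, which is why the denominator uses the number of responses rather than the panel size.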

Discussion
In this project, we evaluated the psychometric properties of the ISWP Basic Test Version 1 in English, updated the test using a systematic approach guided by stakeholders to develop the ISWP Basic Test Version 2, and evaluated the content validity of this new version. The results of the review indicate that only 20% of the questions included in the ISWP Basic Test Version 1 in English met the thresholds for difficulty and the index of discrimination considered in the literature as standard levels for effective testing [28,29]. The ISWP Basic Test Version 2 was developed considering the question and domain performance of Version 1 and guidelines for the development of question texts and answers, which resulted in a test pool of 172 new items grouped in six sub-domains. Results from the content validity assessment showed good agreement on the items' clarity and relevance. The feedback provided by reviewers was used to improve the items' clarity and relevance, and 168 items were retained for future pilot testing.

ISWP Basic Test Version 1: Psychometric properties
More than two-thirds of test-takers passed the ISWP Basic Test Version 1 in English on their first attempt. This passing rate is at odds with the common perception that more wheelchair-service-provision training is needed worldwide [30] and with the robust evidence reporting limited competency and education among entry-to-practice rehabilitation students [7,8,31] and clinicians [2,6,9,32]. It could be assumed that the high passing rates reflect a homogeneous group that is highly educated in wheelchair-service-provision. However, the socio-demographic characteristics of the sample show a diverse group of participants. The sample included test-takers from different continents with educational levels that ranged from high school to doctorate degrees. Although most test-takers held a bachelor's degree in rehabilitation professions primarily responsible for wheelchair service delivery, it cannot be expected that their wheelchair-service-provision competence was due to their professional training. Evidence has emerged demonstrating that professional rehabilitation programs
worldwide (i.e., Physical Therapy, Occupational Therapy and Prosthetics and Orthotics) have very limited wheelchair-related education in their curricula [33][34][35]. As such, the passing rates may be associated with other wheelchair training. Unfortunately, the socio-demographic questionnaire did not further explore the characteristics of the training received. In the future, additional information can be obtained regarding test-takers' previous wheelchair training. Based on the psychometric properties of the ISWP Basic Test Version 1 and the qualitative review of the test questions conducted in this project, we believe that the high pass rates do not necessarily represent test-takers' competency in basic wheelchair-service-provision. Rather, they may represent problems with the test. None of the domains of the ISWP Basic Test Version 1 had the minimum internal consistency needed to presume a strong relationship between items. The negative KR-20 values in Fitting, Production and Follow-up suggest that one or more of the items in those domains are not performing properly. This may be due to the misplacement of questions across domains. Overall, the internal consistency reliability results suggest that the items grouped in each domain are not measuring what they are intended to measure. Instead, they are measuring many unknown factors. At the item level, the results show that 80% of the total test pool did not meet the thresholds for item difficulty and index of discrimination suggested in the literature [16,28]. Most questions exceeded the recommended difficulty threshold for multiple-choice questions, indicating that more than two-thirds of the questions could be responded to intuitively without adequate knowledge of wheelchair-service-provision. Similarly, about two-thirds of the questions were unable to distinguish between test-takers who are knowledgeable in the content area and those who are not [16].
These results suggest that the ISWP Basic Test Version 1, which has been used as part of an international wheelchair-service-provision certification process, does not discriminate between test-takers with basic knowledge of wheelchair-service-provision and those without it. Our results support the assumption of Rushton et al. [14] that the ISWP Basic Test may not be accurately measuring basic wheelchair-service-provision competency, and that the low degree of internal consistency between the translated French-Canadian version and the original version may be due to issues with the original test.
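The Classical Test Theory quantities discussed in this subsection can be illustrated with a short sketch. It computes item difficulty as the proportion of correct answers, a discrimination index as the difference in proportion correct between upper and lower scoring groups, and KR-20 for a set of dichotomous items. The 27% group fraction, the use of population variance, the function names, and the response data are all assumptions made for illustration; they are not the study's analysis code or data.

```python
# Sketch of Classical Test Theory item statistics for 0/1 scored items.
# Assumptions (not taken from the study): upper and lower groups are the
# top and bottom 27% by total score, and KR-20 uses population variance.
from statistics import pvariance

def item_difficulty(responses):
    """Proportion of test-takers answering each item correctly."""
    n, k = len(responses), len(responses[0])
    return [sum(row[j] for row in responses) / n for j in range(k)]

def discrimination_index(responses, fraction=0.27):
    """Upper-group minus lower-group proportion correct, per item."""
    ranked = sorted(responses, key=sum, reverse=True)
    g = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:g], ranked[-g:]
    k = len(responses[0])
    return [
        sum(r[j] for r in upper) / g - sum(r[j] for r in lower) / g
        for j in range(k)
    ]

def kr20(responses):
    """Kuder-Richardson 20 internal consistency for dichotomous items."""
    k = len(responses[0])
    total_var = pvariance([sum(row) for row in responses])
    if k < 2 or total_var == 0:
        raise ValueError("need at least two items and varying total scores")
    p = item_difficulty(responses)
    item_var = sum(pj * (1 - pj) for pj in p)
    return (k / (k - 1)) * (1 - item_var / total_var)

# Hypothetical responses: eight test-takers (rows) by four items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

difficulty = item_difficulty(responses)
discrimination = discrimination_index(responses)
reliability = kr20(responses)
```

In this made-up data set the first item is answered correctly by everyone, so its difficulty value is 1.0 and its discrimination index is 0: it cannot separate stronger from weaker test-takers. KR-20 turns negative whenever the summed item variances exceed the variance of the total scores, which happens when items within a domain tend to covary negatively.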

Development of the ISWP Basic Test Version 2
Sixty-one new items were developed for ISWP Basic Test Version 2, the largest number of which (n = 22) belong to the User training sub-domain. This was made possible by using the WSP, a gold standard training package, as a reference [4]. The questions added relate to indoor wheelchair skills, community skills, advanced skills and motor learning principles. Most of this content is not covered by the WHO Basic Package, but it is included in the WSP manuals and training resources that are freely available online. We placed particular emphasis on increasing the pool of User training questions, as recent evidence reveals that 1) few professionals provide wheelchair skills training to their clients and caregivers [36] and 2) rehabilitation professionals receive limited to no training on wheelchair skills [35,37], despite the fact that such training has been found to be highly effective [38,39]. We hope that increasing the pool of this sub-domain draws the attention of educators, trainers, and trainees to the importance of wheelchair skills training as part of the wheelchair service delivery process.
In the ISWP Basic Test Version 2, we included the domain Core Knowledge and six of the WHO 8 steps (i.e., Assessment, Prescription, Product preparation, Fitting, User training, and Follow-up). The domain Core Knowledge was populated with many items previously grouped in the sub-domain Process. According to Gartz et al., the sub-domain 'Process' was created in the development of the ISWP Basic Test Version 1 with content from two of the WHO 8 steps: Referral and appointment, and Funding and ordering [5]. However, the quality review of the text and answers of the Test's questions revealed that many questions were misplaced in the domains and that some, like those included in Process, did not correspond to the titled content. These findings increased the scope of the project and resulted in all questions being reclassified.
In terms of the WHO 8 steps, we decided not to include Referral and appointment or Funding and ordering because they lack sufficient content to create an adequate pool of questions. Without an adequate number of items per sub-domain, it is difficult to create items with a range of difficulty levels that allow discrimination between test-takers' levels of knowledge [40]. Further, these two steps may be highly influenced by the context [1] and are less suitable for standardization in an international test. That being said, we recognize the importance of these steps and suggest revisiting the decision in the future using the data analysis from subsequent test results. Also, the WHO is in the process of revising its Guidelines, and the classification of the components of wheelchair service delivery is likely to evolve based on experience and evidence acquired since its introduction in 2008 [12].
The proposed new classification of sub-domains of the ISWP Basic Test Version 2 is more closely aligned with the content of the WHO Basic Package. This classification recognizes that basic wheelchair-service-provision knowledge is the combination of two constructs: 1) Core Knowledge, which includes the background knowledge to undertake an appropriate wheelchair service delivery; and 2) Wheelchair Service Steps, the interconnected actions to provide an appropriate wheelchair service delivery [41]. We were particularly interested in keeping Core Knowledge as a domain because, in many places worldwide, wheelchair-service-provision is led by personnel without a healthcare professional degree (e.g., community-based workers, technicians, and local craftsmen) who may benefit from reviewing and assessing this section of the WHO Basic Package [3,41].
The developed set of questions had an excellent scale content validity index, presuming "strong conceptualizations of constructs, good items, judiciously selected experts, and clear instructions to experts regarding the underlying constructs and rating task" [24]. We received feedback on the clarity of 66 items (38.4%) via open-ended comments that guided the improvement of the questions' text. Only 4 (2.3%) items were deemed not relevant and were dropped from the final set. The proposed new classification of domains and items was well accepted by our group of subject matter experts who considered that only 7 (4.1%) items did not correspond to the proposed sub-domain.

Study limitations
A potential limitation of this study is that the feedback received from members of the ATC during the review and update of the items, and from the SMEG during the content-validity analysis, is likely to reflect biases arising from their expertise and experience. We used purposive and convenience sampling, respectively, to recruit members of the two groups, and the groups may not have included experts from all steps of the wheelchair service delivery process.

Future directions
The next phase of this project includes 1) pilot testing the questions, 2) evaluating the psychometric properties (i.e., item difficulty, item discrimination index, and domains' internal consistency reliability), 3) selecting the final question pool, and 4) determining the domains' weighting for the new ISWP Basic Test Version 2. Once the second phase of the project is complete, the ISWP Basic Test Version 2 will replace Version 1 and will be available to the general public. It is important to note that the ISWP Basic Test Version 2 is not parallel in form with Version 1. The questions included in the newest version have changed significantly: not a single question was left unedited between versions, and new questions were developed considering other training materials. As such, the test sub-domains should not be compared across versions unless test equating is conducted [42].
ISWP is encouraged to allocate funding to complete this project promptly in order to offer a psychometrically sound test to measure basic wheelchair-service-provision competency. Considering that current and future versions of the Test are part of a fee-based certification process, we deemed it appropriate that stakeholders and subject matter experts be compensated for their time and contributions. Seven years passed between the launch of the ISWP Basic Test Version 1 and the review of its psychometric properties. We recommend the development of a plan and a task force to periodically review and update the Test in line with best practices in psychometric analysis, including modern test theory, and to explore computer adaptive testing to ensure a fair and representative exam while minimizing test-taker burden [43].

Conclusions
Valid and reliable scores from measurement tools for basic manual wheelchair-service-provision are necessary to support competency development and to promote good practice. The ISWP Basic Test is an international test that has been used since 2015 to evaluate basic manual wheelchair-service-provision knowledge. In spite of the study limitations and the need for further study and Test monitoring, this is the first study to explore the psychometric properties and conduct a qualitative review of the ISWP Basic Test Version 1. The results from the psychometric study revealed that the current version of the test does not meet the minimum standard parameters for effective testing. The new items developed by experts in the field have shown excellent content validity. Future work will pilot test the items and determine the weighting of the sub-domains for the new ISWP Basic Test Version 2.
Supporting information S1